Speech interface for name input based on combination of recognition methods using syllable-based n-gram and word dictionary
Abstract
We propose an interface for name input based on speech recognition using a syllable-based N-gram and a word dictionary. Name utterances are hard to recognize accurately because of the large vocabulary size, so the system combines continuous syllable recognition using a syllable-based N-gram with isolated word recognition using a dictionary of frequent words. The user first utters a name and then chooses the correct word or syllables by pen touch from the word and syllable candidates obtained from speech recognition. The system displays word candidates, syllable sequence candidates, and a syllable lattice on a touch panel, and the user selects the desired word from these candidates. In our evaluation, users found the correct answer among the word candidates or syllable sequence candidates at a rate of 82-86%, and could input the correct name at a rate of 94-96% by selecting syllables from the syllable lattice. Some subjects used the interface and felt that it was efficient and useful.
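The selection flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the romanized example syllables, and the simplified lattice (one list of alternative syllables per position, with no scores or alternative segmentations) are all assumptions made for illustration.

```python
from typing import List, Optional


def find_in_lattice(lattice: List[List[str]],
                    target_syllables: List[str]) -> Optional[List[int]]:
    """Return the index to pick in each lattice slot, or None if unreachable.

    Assumption: the lattice is simplified to one list of alternative
    syllables per position (hypothetical structure, not from the paper).
    """
    if len(lattice) != len(target_syllables):
        return None
    picks = []
    for slot, syllable in zip(lattice, target_syllables):
        if syllable not in slot:
            return None
        picks.append(slot.index(syllable))
    return picks


def input_name(target_syllables: List[str],
               word_candidates: List[str],
               syllable_candidates: List[List[str]],
               lattice: List[List[str]]) -> str:
    """Simulate a user looking for the intended name, stage by stage."""
    target = "".join(target_syllables)
    # Stage 1: isolated word recognition with the frequent-word dictionary.
    if target in word_candidates:
        return "word"
    # Stage 2: N-best syllable sequences from continuous syllable recognition.
    if target_syllables in syllable_candidates:
        return "syllable_sequence"
    # Stage 3: pick one syllable per position from the lattice.
    if find_in_lattice(lattice, target_syllables) is not None:
        return "lattice"
    return "failed"


if __name__ == "__main__":
    # Hypothetical recognizer output for the intended name "tanaka".
    words = ["takana", "nakata"]
    sequences = [["ta", "na", "ga"], ["ka", "na", "ka"]]
    lattice = [["ka", "ta"], ["na", "ma"], ["ga", "ka"]]
    print(input_name(["ta", "na", "ka"], words, sequences, lattice))
```

In this sketch the intended name appears in neither candidate list, yet every one of its syllables appears in the lattice, so the lattice stage recovers it. This matches the abstract's finding that lattice selection raises the rate of correct input above what the candidate lists alone achieve.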
Similar resources
Syllable-based Speech Recognition System for Myanmar
The proposed system is a syllable-based Myanmar speech recognition system. It has three stages: Feature Extraction, Phone Recognition and Decoding. In feature extraction, the system transforms the input speech waveform into a sequence of acoustic feature vectors, each vector representing the information in a small time window of the signal. And then the likelihood of the observation of featur...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Spoken Term Detection Based on a Syllable N-gram Index at the NTCIR-11 SpokenQuery&Doc Task
For spoken term detection, it is crucial to consider out-of-vocabulary (OOV) words and the mis-recognition of spoken words. Therefore, various sub-word unit based recognition and retrieval methods have been proposed. We also proposed a distant n-gram indexing/retrieval method for spoken queries, which is based on a syllable n-gram and incorporates a distance metric in a syllable lattice. The distance ...
Heuristic Syllabification and Statistical Syllable-Based Modeling for Speech-Input Topic Identification
We describe a heuristic syllabification method and the use of a statistical syllable n-gram language model for discriminating between a closed set of topics. The syllabification method works by assigning costs to consonant clusters and then splitting the clusters where the cost is minimized. We apply the syllabification on a pronunciation dictionary which maps words to phone sequences; the resu...
Speech Enhancement using Adaptive Data-Based Dictionary Learning
In this paper, a speech enhancement method based on sparse representation of data frames is presented. Speech enhancement is one of the most widely applicable areas of signal processing. The objective of a speech enhancement system is to improve either the intelligibility or the quality of speech signals. This process is carried out using the speech signal processing techniques ...